Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure.
Abstract
We propose a multivariate sparse group lasso method for variable selection and estimation in data with high-dimensional predictors as well as high-dimensional response variables. The method fits a penalized multivariate multiple linear regression model with an arbitrary group structure on the regression coefficient matrix. It suits many biological studies that seek associations between multiple traits and multiple predictors, where each trait and each predictor belongs to biological functional groups such as genes, pathways, or brain regions. The method effectively removes unimportant groups as well as unimportant individual coefficients within important groups, particularly in large-p, small-n problems, and it flexibly handles complex group structures such as overlapping, nested, or multilevel hierarchical structures. The method is evaluated through extensive simulations, with comparisons to the conventional lasso and group lasso methods, and is applied to an eQTL association study.
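To make the idea of a penalty over an arbitrary group structure concrete, the following is a minimal sketch and not the authors' implementation. It assumes a commonly used sparse group lasso form: a least-squares loss plus a weighted sum of Euclidean norms over groups of entries of the coefficient matrix B, plus an entrywise l1 term. The function names `msgl_penalty` and `msgl_objective`, the parameters `lam_group` and `lam_l1`, and the default square-root-of-size group weights are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def msgl_penalty(B, groups, lam_group, lam_l1, weights=None):
    """Hypothetical sparse group lasso penalty on a coefficient matrix B.

    groups: list of non-empty groups, each a list of (row, col) index
            pairs into B. Groups may overlap or be nested (assumption:
            this is how an "arbitrary group structure" is encoded here).
    weights: one non-negative weight per group; defaults to sqrt(group
             size), a common convention that is assumed, not sourced.
    """
    B = np.asarray(B, dtype=float)
    if weights is None:
        weights = [np.sqrt(len(g)) for g in groups]

    group_term = 0.0
    for g, w in zip(groups, weights):
        rows, cols = zip(*g)
        # Euclidean norm of the entries of B that belong to this group.
        group_term += w * np.linalg.norm(B[list(rows), list(cols)])

    # Entrywise l1 norm drives individual coefficients to zero.
    l1_term = np.abs(B).sum()
    return lam_group * group_term + lam_l1 * l1_term


def msgl_objective(Y, X, B, groups, lam_group, lam_l1, weights=None):
    """Least-squares loss plus the penalty above (assumed 1/(2n) scaling)."""
    n = X.shape[0]
    residual = Y - X @ B
    loss = 0.5 / n * np.sum(residual ** 2)
    return loss + msgl_penalty(B, groups, lam_group, lam_l1, weights)
```

In this sketch, a row group such as `[(0, 0), (0, 1)]` ties one predictor's coefficients across two responses, while a column group such as `[(0, 0), (1, 0)]` ties two predictors for one response; listing both lets the same entry sit in overlapping groups. Minimizing the objective would need a solver that is not shown here.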
Related articles
Sparse Reduced-Rank Regression for Simultaneous Dimension Reduction and Variable Selection in Multivariate Regression
The reduced-rank regression is an effective method to predict multiple response variables from the same set of predictor variables, because it can reduce the number of model parameters as well as take advantage of interrelations between the response variables and therefore improve predictive accuracy. We propose to add a new feature to the reduced-rank regression that allows selection of releva...
Support Union Recovery in High-Dimensional Multivariate Regression
In multivariate regression, a K-dimensional response vector is regressed upon a common set of p covariates, with a matrix B* ∈ R^{p×K} of regression coefficients. We study the behavior of the multivariate group Lasso, in which block regularization based on the ℓ1/ℓ2 norm is used for support union recovery, or recovery of the set of s rows for which B* is non-zero. Under high-dimensional scaling, w...
Regression Performance of Group Lasso for Arbitrary Design Matrices
In many linear regression problems, explanatory variables are activated in groups or clusters; group lasso has been proposed for regression in such cases. This paper studies the nonasymptotic regression performance of group lasso using ℓ1/ℓ2 regularization for arbitrary (random or deterministic) design matrices. In particular, the paper establishes under a statistical prior on the set of nonzer...
The Performance of Group Lasso for Linear Regression of Grouped Variables
The lasso [19] and group lasso [23] are popular algorithms in the signal processing and statistics communities. In signal processing, these algorithms allow for efficient sparse approximations of arbitrary signals in overcomplete dictionaries. In statistics, they facilitate efficient variable selection and reliable regression under the linear model assumption. In both cases, there is now ample ...
High-dimensional regression with unknown variance
We review recent results for high-dimensional sparse linear regression in the practical case of unknown variance. Different sparsity settings are covered, including coordinate-sparsity, group-sparsity and variation-sparsity. The emphasis is put on non-asymptotic analyses and feasible procedures. In addition, a small numerical study compares the practical performance of three schemes for tuning ...
Journal title:
- Biometrics
Volume 71, Issue 2
Pages -
Publication date 2015